Monotone minimal perfect hashing: searching a sorted table with O(1) accesses

Authors

  • Djamal Belazzougui
  • Paolo Boldi
  • Rasmus Pagh
  • Sebastiano Vigna
Abstract

A minimal perfect hash function maps a set S of n keys into the set {0, 1, ..., n − 1} bijectively. Classical results state that minimal perfect hashing is possible in constant time using a structure occupying space close to the lower bound of log e bits per element. Here we consider the problem of monotone minimal perfect hashing, in which the bijection is required to preserve the lexicographical ordering of the keys. A monotone minimal perfect hash function can be seen as a very weak form of index that provides ranking just on the set S (and answers randomly outside of S). Our goal is to minimise the description size of the hash function: we show that, for a set S of n elements out of a universe of 2^w elements, O(n log log w) bits are sufficient to hash monotonically with evaluation time O(log w). Alternatively, we can get space O(n log w) bits with O(1) query time. Both of these data structures improve a straightforward construction with O(n log w) space and O(log w) query time. As a consequence, it is possible to search a sorted table with O(1) accesses to the table (using additional O(n log log w) bits). Our results are based on a structure (of independent interest) that represents a trie in a very compact way, but admits errors. As a further application of the same structure, we show how to compute the predecessor (in the sorted order of S) of an arbitrary element, using O(1) accesses in expectation and an index of O(n log w) bits, improving the trivial result of O(nw) bits. This implies an efficient index for searching a blocked memory.
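To make the interface concrete, here is a minimal Python sketch of what a monotone minimal perfect hash function provides and how it lets a sorted table be searched with a single table access. It is not the construction from the paper: the class `NaiveMonotoneMPHF` and the helper `lookup` are hypothetical names, and the dictionary used here takes far more space than the O(n log log w) or O(n log w) bits stated in the abstract. It only mimics the behaviour: ranks on S, arbitrary in-range answers outside S.

```python
class NaiveMonotoneMPHF:
    """Illustrative stand-in for a monotone minimal perfect hash function.

    Assumption: a plain dict stores the ranks, so this is NOT space-efficient
    and is not the compact trie-based structure described in the abstract.
    """

    def __init__(self, keys):
        ordered = sorted(set(keys))  # lexicographic order for strings
        if not ordered:
            raise ValueError("the key set S must be non-empty")
        self._rank = {key: i for i, key in enumerate(ordered)}
        self._n = len(ordered)

    def __call__(self, key):
        # Keys in S map to their rank, so the bijection preserves order.
        if key in self._rank:
            return self._rank[key]
        # Keys outside S get an arbitrary value in {0, ..., n - 1}.
        return hash(key) % self._n


def lookup(table, h, key):
    """Return the index of key in the sorted table, or None if absent.

    Exactly one access to the table is made: h proposes a position and
    a single comparison confirms or rejects it.
    """
    i = h(key)
    return i if table[i] == key else None


table = sorted(["fig", "apple", "pear", "kiwi", "date"])
h = NaiveMonotoneMPHF(table)
position = lookup(table, h, "kiwi")   # index of "kiwi" in the sorted table
missing = lookup(table, h, "zucchini")  # None, since the key is not in S
```

The single comparison in `lookup` is what turns the arbitrary answers outside S into a correct membership test: a wrong position can only point at a different key, which the comparison rejects.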

Similar resources

Theory and Practise of Monotone Minimal Perfect Hashing

Minimal perfect hash functions have been shown to be useful to compress data in several data management tasks. In particular, order-preserving minimal perfect hash functions [12] have been used to retrieve the position of a key in a given list of keys: however, the ability to preserve any given order leads to an unavoidable Ω(n log n) lower bound on the number of bits required to store the funct...

Hash and Displace: Efficient Evaluation of Minimal Perfect Hash Functions

A new way of constructing (minimal) perfect hash functions is described. The technique considerably reduces the overhead associated with resolving buckets in two-level hashing schemes. Evaluating a hash function requires just one multiplication and a few additions apart from primitive bit operations. The number of accesses to memory is two, one of which is to a fixed location. This improves the...

A Practical Minimal Perfect Hashing Method

We propose a novel algorithm based on random graphs to construct minimal perfect hash functions h. For a set of n keys, our algorithm outputs h in expected time O(n). The evaluation of h(x) requires two memory accesses for any key x and the description of h takes up 1.15n words. This improves the space requirement to 55% of a previous minimal perfect hashing scheme due to Czech, Havas and Majew...

Indexing Internal Memory with Minimal Perfect Hash Functions

A perfect hash function (PHF) is an injective function that maps keys from a set S to unique values, which are in turn used to index a hash table. Since no collisions occur, each key can be retrieved from the table with a single probe. A minimal perfect hash function (MPHF) is a PHF with the smallest possible range, that is, the hash table size is exactly the number of keys in S. MPHFs are wide...

DCuckoo: An Efficient Hash Table with On-chip Summary

Hash tables are extensively used in many computer-related areas because of their efficiency in query and insert operations. However, hash tables have two disadvantages: collisions and memory inefficiency. To address these two disadvantages, Minimal Perfect Hash Table (MPHT) uses N locations to store N incoming elements. However, MPHT doesn't support incremental updates. Therefore, in this paper, combining ...

Publication date: 2009